ABSTRACT

Optimizing performance on work activities and processes requires metrics of performance for 
management to monitor and analyze in order to support further improvements in efficiency, 
effectiveness, safety, reliability and cost. Information systems are therefore required to assist 
management in making timely, informed decisions regarding these work processes and activities. 
Currently, no information systems exist for Space Shuttle maintenance and servicing that support
such timely decisions. The work to be presented details a system which incorporates
various automated and intelligent processes and analysis tools to capture, organize and analyze
work process related data, in order to support the decisions necessary to meet KSC organizational goals.

The advantages and disadvantages of design alternatives for the development of such a system
will be discussed, including the technologies which would need to be designed, prototyped and
evaluated.

Intelligent Work Process Engineering System
Kent E. Williams 


1. INTRODUCTION 

The behavior of an organization is motivated by its goals. The processes, activities and materials 
required to meet organizational goals, in turn, define organizational behavior. These activities 
and processes are sometimes referred to as the value chain of an organization. The value chain 
activities are critical to the successful performance of the organization in meeting its goals and
objectives. The goals of the KSC organization are to launch and recover space vehicles in the 
service of the greater NASA mission. These goals are achieved as a result of the various work 
processes and activities performed at KSC and other supporting facilities, both contractor and
government. Consequently, these activities and processes must be performed safely, reliably,
efficiently and effectively while optimizing cost and the predictability of performance of space
vehicles. Optimizing performance on these work activities and processes requires metrics of 
performance for management to monitor and analyze to support further improvements in 
efficiency, effectiveness, safety, reliability and cost. Being able to monitor and analyze these
metrics will further enhance KSC operations in the service of their mission. Information
systems are therefore required to assist management in making timely, informed decisions
regarding these work processes and activities. Currently, no information systems exist for Space
Shuttle maintenance and servicing that support such timely decisions regarding the
specific activities which must be performed to ensure the safety, reliability, efficiency and
predictability of Shuttle processing.

As a case in point, a report identifying Root Cause for Space Shuttle Operations and
Infrastructure Costs was recently developed by McCleskey [1]. His analysis found that ~25%
of Direct Work costs were categorized as Unplanned Troubleshooting and Repair activities and
another ~24% of Direct Work costs were associated with Vehicle Servicing. However, the
specific activities making up these servicing and repair tasks could not be readily identified. The
magnitude of these unplanned activities relative to the totality of direct labor reflects
uncertainties and risks in the design of the vehicle. This indirectly impacts the cost for
Operations Support, Logistics, Sustaining Engineering, Safety Reliability and Quality Assurance,
and Flight Certification, all of which are hidden costs. Most importantly, these hidden costs
make up the greatest percentage of recurring operational expenditures.

If data were captured and organized relative to the tasks making up work instructions, one could 
conduct a deeper analysis of unplanned as well as planned vehicle processing activities. 
Management could address work process changes and design changes to reduce costs and risk in 
the future while improving upon the efficiencies of processing the Shuttle. Start-stop time data 
associated with tasks related to differing Operational Functions and Design Disciplines could be 
captured and analyzed to pinpoint specific problems. These problems could then be targeted for
improvement in process and/or design. These data could also be used to develop models of future
design alternatives to make predictions concerning budgetary requirements and the reliability of
designs.


2. BACKGROUND KNOWLEDGE: MANAGEMENT INFORMATION
REQUIREMENTS AND TOOLS 

Management is decision-making and problem solving. In order for management to make 
informed decisions about the performance of activities regarding the safety, efficiency, reliability 
and costs of operations, data must be gathered and translated into information regarding these 
dependent measures. Numerous methods and technologies (i.e., tools) have been developed for
industry to conduct the analyses management needs in order to make decisions regarding these
measures. One of the classical techniques employed is the control chart. There are
numerous forms of the control chart [2]. In the abstract, however, the control chart is a data
mining tool which plots a measure of performance as a function of time. The chart is marked
by three lines: the centerline, which represents the mean value of the measure of
interest, the upper control limit and the lower control limit. The upper and lower control limits
typically lie some number of standard deviation units above and
below the mean value represented by the centerline. As the measure of performance is plotted
over time, one can see how this measure is varying.

Of specific interest, however, is when the measure starts moving toward either the upper or
lower boundary. Such movement indicates that the process is going out of control. A visual
inspection of the chart will show a slope deviating from a straight horizontal line and moving
toward one of these boundaries. One can glance at the chart and judge how quickly the process
being measured will go out of control by looking at the magnitude of this slope. Figure 1.0 shows
a control chart and a trend developing, which would indicate that the process is going out of
control. Exceeding one of these boundaries indicates that there is a failure in the process or in
the material being measured. This kind of information can be used to prevent, or alert operators
to, the potential failure of a system. Armed with such information, management can then make
decisions regarding system operability and can trigger an analysis of the causes for the system
going out of control. Ideally, this will preclude future problems in the operation of the system.
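
The control chart logic described above can be sketched in a few lines of Python. This is a
minimal illustration, not part of the proposed system: the function names, the choice of three
standard deviations, and the task completion times are all assumptions made for the example.

```python
from statistics import mean, stdev


def control_limits(baseline, n_sigma=3.0):
    """Return (lower limit, centerline, upper limit) computed from baseline data."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline
    return center - n_sigma * spread, center, center + n_sigma * spread


def out_of_control(values, lcl, ucl):
    """Indices of observations that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


def trend_slope(values):
    """Least-squares slope of the measure against its observation index."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


# Hypothetical task completion times in hours (illustrative values only).
baseline_hours = [4.0, 4.2, 3.9, 4.1, 4.0, 3.8, 4.1, 3.9]
recent_hours = [4.1, 4.3, 4.6, 4.9, 5.3]

lcl, center, ucl = control_limits(baseline_hours)
flagged = out_of_control(recent_hours, lcl, ucl)  # observations beyond a limit
slope = trend_slope(recent_hours)                 # hours gained per observation
```

With these illustrative numbers the last three observations exceed the upper control limit, and
the positive slope gives the rate at which the measure is drifting toward that limit.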



Figure 1.0 Generic depiction of a process control chart showing a trend which is exceeding the
upper control limit of +3 standard deviations (sd) above the mean.

This type of data analysis can be employed to plot and track the variation of an activity about its 
expected value in terms of time to complete that activity or to track the performance of a 
component or system about some expected value indicative of its health. Moreover, costs to
perform specific activities can be plotted to determine if expected costs are on track, below
expectations or exceeding expectations. Other statistical regression analysis techniques can
equally well be applied to the types of dependent measures reflective of safety, reliability,
efficiency and cost. These techniques can be used to make predictions regarding the time to
failure of a component or the time to complete a process given historical data from past records
[3]. Madigan and Ridgeway [4] have also described Bayesian analysis techniques for making
such predictions and for modeling processes for which prior probability distributions on work
metrics are available.

Another type of analysis typically performed by management, especially when monitoring
organizational performance, is root cause analysis or drill down. Root cause analysis is a
technique for identifying the source or cause of a specific problem. In
management information system terms, root cause analysis is performed by drilling down into
the various layers of information recorded regarding specific activities of the organization to
identify the source of a problem. This is typically a lower level of analysis than that conducted
employing control charts of a specific process or system performance. Control charts
can indicate that a process or system is about to go out of control, signaling that a special cause
for the loss of control is present, but control charts and trend analyses do not specifically
identify the cause or source of the problem. Identifying the source or cause of the problem
requires further in-depth analysis of the process or system. However, with well specified data
recorded at some basic unit of analysis, management can perform root cause analyses on
organizational activities employing a drill down capability.
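
As a rough illustration of drill down, the sketch below totals labor hours over task-level
records and then narrows the view one attribute at a time. The record fields, the OMI labels
and the hour values are hypothetical and were invented for this example; they are not taken
from any KSC data source.

```python
from collections import defaultdict

# Hypothetical task-level records; field names and values are illustrative only.
records = [
    {"omi": "OMI-A", "function": "Servicing", "discipline": "Fluids", "planned": True, "hours": 6.0},
    {"omi": "OMI-A", "function": "Repair", "discipline": "Fluids", "planned": False, "hours": 9.5},
    {"omi": "OMI-B", "function": "Repair", "discipline": "Avionics", "planned": False, "hours": 4.0},
    {"omi": "OMI-B", "function": "Repair", "discipline": "Fluids", "planned": False, "hours": 7.5},
    {"omi": "OMI-C", "function": "Inspection", "discipline": "Structures", "planned": True, "hours": 3.0},
]


def drill_down(rows, key, **filters):
    """Total hours grouped by `key`, restricted to rows matching every filter."""
    totals = defaultdict(float)
    for row in rows:
        if all(row[name] == value for name, value in filters.items()):
            totals[row[key]] += row["hours"]
    return dict(totals)


# Level 1: planned versus unplanned hours.
by_plan = drill_down(records, "planned")
# Level 2: which design disciplines account for the unplanned hours?
unplanned_by_discipline = drill_down(records, "discipline", planned=False)
# Level 3: which work instructions drive the unplanned hours in one discipline?
fluids_by_omi = drill_down(records, "omi", planned=False, discipline="Fluids")
```

Each call narrows the previous one by adding a filter, which is the essence of drilling down
from an aggregate cost figure to the specific tasks responsible for it.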

As an example, in the analysis performed by McCleskey [1] regarding the apportionment of
dollars to direct labor, he found that ~50% of the direct labor costs were associated with
unplanned troubleshooting and repair and with vehicle servicing. However, the specific
activities behind these costs could not readily be identified.
Consequently, management has still not identified the source or deeper cause for such costs. In
essence, such costs still remain unaccounted for. However, if data could be captured at a basic,
fundamental unit of analysis for work processes, one could identify precisely what tasks are
responsible for this large portion of direct labor costs. For example, if data regarding the tasks of
a work instruction along with the context within which the task is performed were captured,
such deeper level analyses and ultimate identification of the source of such costs could be
achieved. Data related to the technical staffing requirements to perform such tasks, the historical
safety record on task related work, the time to complete a task, the materials required for the
completion of the task, the ground service equipment required for the task, any associated quality
assurance support, hazard and safety considerations, as well as environmental facility
requirements and associated costs could be captured and stored in a data warehouse. Armed with
this store of data relative to this basic unit of analysis, the task unit and its context, management
could readily perform root cause analyses to answer any questions regarding potential deviations
from expected operations along with safety and health concerns.

A third class of management tools, which has recently received broad attention and application
in industry, is the simulation and modeling of industrial and organizational processes [5].
These modeling tools allow managers to create chains of input-output processes which reflect the
underlying activities of the organization. The inputs to the processes and the processes
themselves are the independent variables, which can be manipulated by management. The
outputs are the effects of the work processes or the transformations imposed upon the inputs;
that is, the results of work performed on the inputs. These outputs are typically referred to as
the dependent variables and can be measured to reflect the performance of the processes imposed
upon the inputs. What is unique about such tools is that management can experiment with 
different organizations of tasks within a process, can modify specific processes by introducing 
new technologies or can change the values of input parameters and receive information regarding 
the likely impact of such changes upon the performance of the organization. Such tools can also 
demonstrate where the bottlenecks to organizational performance occur and allow for a deeper 
analysis of the underlying cause and effect relationships inherent in any model of the 
organization, process or system. Such simulation and modeling tools have effectively been used 
to make predictions regarding performance. As a relevant case in point, all of the space missions 
executed by NASA in one form or another were the products of simulation and modeling tools. 
The Department of Defense has also directed all services that any future defense program 
procurements will require that competing contractors supply simulations and models of the 
operation of the engineering systems to be designed and developed along with models of life 
cycle development processes for the procured system. 

Given the basic unit of analysis, the task and its context, along with the associated data which 
could be captured for the data warehouse, models can be developed and executed to make 
numerous predictions. For example, alternative sequences of tasks can be modeled to determine
if any improvement in efficiency and/or schedule can be achieved. Information regarding
materials required for the conduct of tasks, the location of task performance, the labor and
support required and the schedule of tasks can be used to make predictions which can affect the
supply chain of operations at KSC. Such process simulations can supply management with an
experimentation platform from which to evaluate improvements which may be associated with 
proposed changes to operations to foster continuous process improvement. 
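
A very small model of this kind is sketched below: it computes the total elapsed time of a set
of tasks under two alternative precedence arrangements. The task names and durations are
hypothetical, and the model assumes unlimited crews and an acyclic set of precedence
relations; a real process simulation would also account for resources and variability.

```python
def makespan(durations, predecessors):
    """Earliest completion time of all tasks, given each task's predecessors.

    Assumes unlimited crews, so a task starts as soon as all of its
    predecessors have finished, and assumes the precedence graph has no cycles.
    """
    finish = {}

    def finish_time(task):
        if task not in finish:
            start = max(
                (finish_time(p) for p in predecessors.get(task, [])),
                default=0.0,
            )
            finish[task] = start + durations[task]
        return finish[task]

    return max(finish_time(task) for task in durations)


# Hypothetical task durations in hours (illustrative values only).
durations = {"drain": 2.0, "inspect": 3.0, "replace_seal": 4.0, "leak_check": 1.5}

# Alternative 1: every task waits for the previous one.
serial = {
    "inspect": ["drain"],
    "replace_seal": ["inspect"],
    "leak_check": ["replace_seal"],
}
# Alternative 2: inspection and seal replacement proceed in parallel after draining.
overlapped = {
    "inspect": ["drain"],
    "replace_seal": ["drain"],
    "leak_check": ["inspect", "replace_seal"],
}

serial_hours = makespan(durations, serial)
overlapped_hours = makespan(durations, overlapped)
hours_saved = serial_hours - overlapped_hours
```

Comparing the two results shows how a manager could evaluate a proposed resequencing of
tasks before committing to it on the floor.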

2.1 Data Requirements for Management Decision-Making To Meet Organizational Goals 

In order for management to make decisions based upon the categories of analyses discussed
above, a basic unit of analysis had to be decided upon which could then be subjected to the
various analysis methods and tools. That unit of analysis was determined to be the task level or 
step level unit embodied in a work instruction. This was found to be the lowest unit of analysis 
about which various data elements could be recorded during the execution of a work instruction. 
This unit of analysis was associated with the various data relevant to making the kinds of 
decisions required on the part of management in order to meet organizational goals. 

Additionally, a scheme for organizing the needed data was developed such that various analyses
could be carried out relevant to processing the Shuttle orbiter.

The results of the investigation of data required are presented in Figures 2.0-6.0. The
investigation of data requirements found that most if not all data required for the conduct of these 
various categories of analysis could be found in the Operational and Maintenance Instructions 
(OMI) and their associated runs as well as in the Problem Reporting and Corrective Action 
(PRACA) data base. The data required was then placed into a structure characterizing this basic 
unit of analysis, the work step or task, along with its context. That is, this structure defines the 
work step and the context in which the step is performed. This structure or instance regarding 
the basic unit of analysis could then be used to organize the data and yield the needed 
information for the conduct of various analyses for management decision-making. The analysis 
yielded the following data structure. 


1. OMI ID#
2. Step Name
3. Requirement Addressed
4. Operational Function
5. Design Discipline Required
6. Materials Required Yes/No
7. Special Consideration Yes/No
8. Quality Assurance Requirement Yes/No
9. Environmental Requirement Yes/No
10. Safety Requirement Yes/No
11. Accident Report
12. Deviation to Work Instruction
13. P/FRACA
14. Initial Problem Report
15. Materials Costs
16. Skill Codes
17. Start Time
18. Stop Time
19. Date Performed
20. Location Of Work Performed

The first ten rows represented in this data structure are used to classify an instance or the basic 
unit of analysis. They define the context in which the work is performed and identify the basic 
unit as belonging to an Operational and Maintenance Instruction (OMI) ID number, row 1.
Within the OMI the basic unit is also given a Step Name, row 2, describing what task must be
performed. The remaining rows, 3-10, of this block characterize the work step to be performed
as a set of attributes which can take on a finite set of values, either Yes/No or some set of nominal
values as in Operational Function and Design Discipline Required. These attributes and their
values were designed in a manner consistent with the way in which management analysts
typically organize their thinking regarding a work step instruction. This classification scheme
was developed as a result of interviews with individual stakeholders who would be the potential 
end users of this kind of management information. 

Rows 11-20 consist of data which is gathered as a result of the actual execution of the work step. 
This is the data that will be used to perform the various types of analyses discussed for 
management decision-making. Figures 2.0-4.0 present data flow diagrams specifying how the 
various data elements within the data structure will be used in the various types of analyses 
discussed. Additionally, Figure 5.0 indicates how this data may also be used to feed back
information regarding Lessons Learned, modifications to Training Requirements and Work 
Requirements. 
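
One possible way to hold this twenty-row structure in software is sketched below as a Python
dataclass. The field names and types are assumptions chosen to mirror the rows listed above;
the source does not prescribe a concrete representation. The elapsed time helper further
assumes that a step starts and stops on the same date.

```python
from dataclasses import dataclass
from datetime import date, datetime, time
from typing import Optional


@dataclass
class WorkUnit:
    # Rows 1-10: attributes that classify the work step and its context.
    omi_id: str
    step_name: str
    requirement_addressed: str
    operational_function: str
    design_discipline: str
    materials_required: bool
    special_consideration: bool
    qa_requirement: bool
    environmental_requirement: bool
    safety_requirement: bool
    # Rows 11-20: data gathered when the step is actually executed.
    accident_report: Optional[str] = None
    deviation: Optional[str] = None
    pfraca: Optional[str] = None
    initial_problem_report: Optional[str] = None
    materials_cost: float = 0.0
    skill_codes: tuple = ()
    start_time: Optional[time] = None
    stop_time: Optional[time] = None
    date_performed: Optional[date] = None
    location: Optional[str] = None

    def elapsed_hours(self) -> Optional[float]:
        """Hours between start and stop, or None if the step has not been run."""
        if self.start_time is None or self.stop_time is None or self.date_performed is None:
            return None
        start = datetime.combine(self.date_performed, self.start_time)
        stop = datetime.combine(self.date_performed, self.stop_time)
        return (stop - start).total_seconds() / 3600.0


# Hypothetical instance; every value here is invented for illustration.
authored = WorkUnit("OMI-A", "Replace seal", "REQ-1", "Repair", "Fluids",
                    True, False, True, False, True)
executed = WorkUnit("OMI-A", "Replace seal", "REQ-1", "Repair", "Fluids",
                    True, False, True, False, True,
                    start_time=time(8, 0), stop_time=time(10, 30),
                    date_performed=date(2001, 1, 15), location="Bay 1")
```

Splitting the fields into a required classification block and an optional execution block
mirrors the distinction drawn above between rows 1-10 and rows 11-20.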

2.2 Current Status of Data Generation and Collection Relative to Shuttle Work Processes 

Currently, work instructions and their associated data are recorded manually on hard copy paper 
sheets detailing the work to be performed. This information is then scanned into a document 
store. No further processing into an electronic format is generated relative to this information. 
Problem reports, on the other hand, are first collected on hard copy forms and then transformed
into an electronic data base which can be readily accessed for analysis. As a result of this
current state, the collection, gathering and organization of the data required to conduct the
analyses for management decision-making is in large part nonexistent. Needed analyses and
information organization would have to be conducted with considerable human interaction. A
major constraint, then, is the absence of an electronic format in which the needed data can be
captured, organized and analyzed. Therefore, although the data exists for such analyses to be
conducted, it is currently in an inert form. Any analysis performed for management
decision-making requires considerable time and effort on the part of the analyst.


3. PROPOSED ALTERNATIVE TO THE CURRENT SYSTEM


As an alternative to the current lack of information systems to support management
decision-making regarding work processes and work process engineering, a paperless work processing
system is proposed. Such a system would include a work authoring tool, electronic work control
and instruction execution capability, a data warehouse which can filter needed information and
automatically organize this information for analysis purposes, process analysis tools for
analyzing work results, and a feedback loop to incorporate changes that could help reduce cost
and improve work efficiency, scheduling, safety and resource allocation. This system could
model work processes to make predictions regarding the costs and schedules associated with
alternative vehicle design configurations, the reliability and safety of a work process, the time to
execute a work process, the utility of a work process, and the cost of implementing a work process in
material, support and labor. In short, such a system would allow the various categories of
analysis to be conducted for management decision-making.

Such a system would allow management to: monitor and set control limits for
work related measurements, identify uncertainties in work processes as targets for continuous
process improvement, identify excessive costs related to work processes, identify root causes for
operational costs and hazards, predict actual time and costs for processing the orbiter and for
estimating operational budgets, provide needed information for analysis of design inefficiencies,
and provide feedback for developing and updating work requirements, training requirements, lessons
learned, and safety in the performance of work and the integrity of the orbiter.

3.1 Description of Proposed Work Process Engineering System 

3.1.1 Work Instruction Authoring 

The alternative system proposed to meet the needs of management is graphically depicted in
Figure 6.0. This system design presupposes that the current process for recording and storing
data relative to work processes renders the data inaccessible by way of standard electronic
processes. Consequently, this alternative would begin processing work instructions with the aid
of an intelligent work instruction authoring system. This system would place all work 
instruction information into an electronic format capable of being stored in accordance with the 
data structure outlined above. Work instructions and their component steps would be uniquely 
authored or retrieved in whole from existing data sources which store the components of the 
work instructions. The work instructions could also be subjected to modification if needed by 
the user. New work instructions could also be developed and designed in accordance with 
accepted human factors principles by accessing a task analysis system. The task analysis system 
would guide the user through a task analysis process and then access other tools to assess the 
human factors issues, which must be addressed in the design of the work instruction. Upon 
completion of a work instruction all of the information regarding the attributes of the work unit 
would have been provided for sorting in the Work Unit Warehouse. 

Issues which must be addressed regarding the authoring of work instructions would include but 
not be limited to: the human interface, the structure of the interview process to elicit information 
from the user, the format and media to be employed relative to differing types of information to 
be communicated to personnel performing the work process, the human engineering principles 
which must be accessed and implemented employing the human factors tool kit, the differing 
types of intelligent search routines required to gather stored information when needed to 
facilitate the construction of an OMI, and other intelligent systems required to store work
instructions and their component steps in a data warehouse which is self-organizing and
adaptive.

3.1.2 Electronic Work Instruction Distribution 

The output from the authoring system would be transmitted electronically to personnel 
responsible for scheduling the execution of the work process and to a data base of work 
instructions. The instruction could alternatively be transmitted to an intelligent work scheduling
system which could automatically schedule the work activities, the supply of needed materials,
ground service equipment, environmental facilities where the work would be performed, and the
necessary technical crew and support personnel. The information upon which such decisions
could be made would be contained in the first block of attributes describing the task step and 
context for the work unit as well as from data contained in the data warehouse. 

The needed personnel, facilities, material requirements and schedule for the work unit along with 
the work instructions, could then be received via wireless transmission by the appropriate 
personnel by way of a personal digital assistant (PDA). Personnel could then enter the raw data 
relative to performance on the task during task execution employing this PDA. Data recorded 
would consist of that specified in the second block of fields in the data structure designed. 
Information regarding problem reports could also be entered by this system such that it could be 
transmitted, received and stored in the existing PRACA data base electronically. Consequently,
upon completion of the work instruction, the information needed to categorize a work unit or 
work step and the data associated with the performance of the work step would be available for 
storage, organization and retrieval in the Work Unit Warehouse of information. 

3.1.3 Work Unit Warehouse 

The Work Unit Warehouse would provide managers with the needed store of information and 
data required to conduct the various analyses for management decision-making. The warehouse 
would consist of an interface for users to create their own organizations of data and information
if desired, as well as machine learning algorithms which would automatically cluster
information contained in the basic work unit data structure specified or some other unit of
analysis specified by a user. That is, a user could construct their own database organized
differently than that specified by the data structure defined. This of course is contingent upon
the information and data being contained in the work instruction database or other
databases which could be accessed by the warehouse routines.

In order to develop such a warehouse an appropriate machine learning algorithm must be 
provided to automatically retrieve, sort and classify information relative to the basic unit of 
work, to be subjected to various analyses. One potential algorithm is CLASSIT developed by
Fisher [6]. CLASSIT is an incremental concept formation algorithm, which takes instances
made up of a set of attributes and their associated values and automatically finds the best 
clustering or organization of these instances. For the case at hand, CLASSIT would sort and 
organize work unit information based upon the attributes and their associated values as specified 
in the basic unit of analysis defined. The categories, which evolve after processing numerous 
instances, would represent classes of information, which are inherent in the instances fed to the 
algorithm. What is unique about CLASSIT is that the conceptual clusters that are formed 
correlate extremely well with those categories, which would have been developed by humans 
given the same instances. Given this current design each instance, which is fed to CLASSIT 
would also be associated with a record which identifies all of the data recorded as a result of the 
execution of a work unit. So, for example, if one wanted to determine how many PRACA reports
were filed relative to a specific Design Discipline, one could query CLASSIT and retrieve
all of the problem reports associated with that Design Discipline. Any question regarding the
attributes of a work unit or the data associated across work units could be answered with this
system. All of the analyses specified for management to make their decisions regarding the
goals of the organization could be conducted given the data stored in this warehouse and the
organization of the data provided by the algorithm. The advantage is that no predefined structure
for the database of information in the warehouse is required. This provides the needed flexibility
for organizing information in various ways to meet managers' information needs. Information
could essentially be sorted and organized based upon any combination of attributes and data 
stored as defined by the basic unit of analysis. 

The features of CLASSIT which are of importance for organizing information are presented in
the following. CLASSIT differs from other algorithms in that it can not only organize
information automatically, but it can modify its organization based upon new instances to be 
sorted. That is, it is self-organizing. CLASSIT operates in an unsupervised fashion. It does not 
require any feedback as to the goodness of fit of the categories it forms, unlike many other 
algorithms, which do require some form of external feedback. Due to the incremental nature of 
this classification algorithm initial instances may bias the clustering of new or future instances to 
be processed. However, CLASSIT continuously evaluates a current organization of information 
such that it can form new classes, merge existing classes and split existing classes to improve 
upon its ability to discriminate between instances, which do or do not belong together. This is 
accomplished by way of a category utility measure. The expression for category utility is based 
upon conditional probabilities and is expressed as follows: 


\[
\frac{\sum_{k=1}^{K} P(C_k) \left[ \sum_{i} \sum_{j} P(A_i = V_{ij} \mid C_k)^2 - \sum_{i} \sum_{j} P(A_i = V_{ij})^2 \right]}{K} \qquad \text{Eq. [1]}
\]


If new attributes are to be added to an instance or deleted from an instance, CLASSIT has the 
capability to modify an existing organization to form a new organization of the information 
based upon the addition or deletion of attributes and their associated values. CLASSIT is also
one of the few algorithms designed to simulate human performance on classification
tasks. The resultant classifications would therefore be most compatible with human
organizational processing of the information. This algorithm can handle missing data and
missing attribute values since it is stochastic in nature. It can accommodate nominal as well as
quantitative values.

CLASSIT forms categories of instances in a hierarchical fashion with the most general classes of
information toward the top of the hierarchy and the more specialized categories toward the 
bottom of the hierarchy. Each specific instance would be stored as a singleton class under its 
parent category. An instance may also belong to more than one different category at any given 
level of organization if the instance cannot be clearly discriminated. This is called clumping and 
has also been observed in human information processing. 

The category utility function of equation [1] is used to classify a new instance into an existing
class, to create a new class (i.e., a singleton), to combine two classes into a single class (i.e.,
merging) or to divide a class into several classes (i.e., splitting). When a new instance is to be
sorted it is first placed in the root node. At this node, as in all other nodes, the system computes
the probability of an instance occurring at that node. This probability yields the P(C_k) value,
which represents the probability that an instance would belong to any given node which has
been created. CLASSIT also computes a conditional probability for each attribute and its
associated value. This is represented as


\[
\sum_{i} \sum_{j} P(A_i = V_{ij} \mid C_k)^2
\]

This is the sum of the squared probabilities of the attributes having specific values given that
they belong to a specific class. It is referred to as the predictability of attribute values
belonging to a given class. If an instance and its associated attribute values match the
probabilities associated with attribute values of an existing class, then there is a high
probability that the instance belongs to that class.

However, this conditional probability should be adjusted by the probability of the attribute and its
value occurring independent of any specific class. This is called predictiveness and is
represented as


\[
\sum_{i} \sum_{j} P(A_i = V_{ij})^2
\]

That is, if an attribute and its value have a high probability of occurrence independent of
membership in any specific class, then the attribute is not providing much information regarding
class membership, since it is highly likely to occur independent of class membership. The number
of classes which the category utility function is evaluating is represented as K.
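
The terms of equation [1] can be made concrete with a short Python sketch. It is a minimal
illustration for nominal attributes only, with probabilities estimated as simple relative
frequencies; the function names and the two-attribute work units are assumptions made for
the example and are not drawn from [6].

```python
from collections import Counter


def sum_squared_probs(instances):
    """Sum over attributes i and values j of P(A_i = V_ij)^2 within `instances`."""
    n = len(instances)
    total = 0.0
    for attribute in instances[0]:
        counts = Counter(inst[attribute] for inst in instances)
        total += sum((count / n) ** 2 for count in counts.values())
    return total


def category_utility(classes):
    """Equation [1] for a partition given as a list of classes (lists of instances)."""
    everything = [inst for cls in classes for inst in cls]
    n = len(everything)
    # Second term of equation [1]: attribute values independent of any class.
    baseline = sum_squared_probs(everything)
    score = sum(
        (len(cls) / n) * (sum_squared_probs(cls) - baseline)  # P(C_k) * [ ... ]
        for cls in classes
    )
    return score / len(classes)  # divide by K


# Hypothetical work units described by two nominal attributes.
a = {"function": "Servicing", "safety": "No"}
b = {"function": "Repair", "safety": "Yes"}

separated = category_utility([[a, a], [b, b]])  # like instances grouped together
mixed = category_utility([[a, b], [a, b]])      # unlike instances grouped together
```

In this toy case the partition that groups like instances together scores higher than the
mixed partition, which scores zero because class membership then tells one nothing about
the attribute values.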

Once an instance is placed in the root node it is then passed to each child of that root node and 
the measure of category utility is applied to each child to determine to which node the new 
instance most likely belongs. If there is a match to any of these children, whichever has the 
highest score will retain the new instance. If none of the children match closely, then CLASSIT 
will consider forming a new singleton class based upon the category utility measure. If two or 
more of the children match the instance closely, then CLASSIT will consider merging those 
children together into a single class. CLASSIT also considers the inverse operation of splitting 
nodes. If CLASSIT decides to classify an instance with an existing category it also considers 
removing the category and making the instances in that category children of that category's 
parent. That is, the category is removed and the instances which belong to that category are 
elevated to become directly linked to that category's parent. In essence they now become 
candidates for new categories if the category utility function indicates an improved clustering. 
This entire process is iterated for each node that CLASSIT visits during its attempt to sort a new 
instance. In this way CLASSIT is continuously modifying its organization based upon the 
probabilities calculated with the introduction of a new instance into the system. 
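
As a minimal sketch of the choice made at a single node, the code below scores two of the operators described above (adding the instance to an existing child, or creating a singleton) and keeps the one with the highest category utility. Merging and splitting would be scored in the same way and are omitted for brevity. The nominal-attribute form of category utility, the function names and the sample data are assumptions made for illustration, not the CLASSIT implementation itself.

```python
from collections import Counter

def _sq(instances):
    """Sum of squared attribute-value probabilities within a set of instances."""
    n = len(instances)
    return sum((c / n) ** 2
               for attr in instances[0]
               for c in Counter(i[attr] for i in instances).values())

def category_utility(classes):
    everything = [i for cls in classes for i in cls]
    base = _sq(everything)
    weighted = sum(len(c) / len(everything) * (_sq(c) - base) for c in classes)
    return weighted / len(classes)

def place_instance(children, instance):
    """Return (operator, child_index, score) for the best-scoring operator."""
    candidates = []
    for k in range(len(children)):
        # Trial partition with the instance added to child k only.
        trial = [cls + [instance] if j == k else cls for j, cls in enumerate(children)]
        candidates.append(("add", k, category_utility(trial)))
    # Trial partition with the instance as a new singleton class.
    singleton = children + [[instance]]
    candidates.append(("singleton", len(children), category_utility(singleton)))
    return max(candidates, key=lambda c: c[2])

# Hypothetical children of one node.
children = [
    [{"shape": "round", "size": "small"}, {"shape": "round", "size": "small"}],
    [{"shape": "square", "size": "large"}, {"shape": "square", "size": "large"}],
]
familiar = {"shape": "round", "size": "small"}
novel = {"shape": "triangle", "size": "medium"}
```

With this data, an instance resembling the first child is added to it, while an instance unlike either child is placed in a new singleton class.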

3.1.4 Search Agents 

The Work Unit Warehouse could also contain search agents, which would search the Work 
Instruction Data Base and retrieve the information required by the basic unit of analysis. This 
would then create the instances required for sorting by CLASSIT in the Work Unit Warehouse, 
organizing the information for analysis by managers. If the data regarding the basic unit of 
analysis is captured and stored employing an Intelligent Work Instruction Authoring system and 
an electronic mode for capturing data relative to the execution of the work instruction, these 
agents could be easily designed. Since the data regarding a work instruction and its execution is 
captured in an electronic format with a known structure, simple If-Then rules could be created to 
retrieve the necessary data for insertion into the slots of the basic unit of analysis. This would 
require that someone knowledgeable about the structure of the Work Instruction Data Base and 
the information needed for insertion into the basic unit of analysis, create the simple If-Then 
rules. Rules could be developed for the retrieval of any element of information in the Work 
Instruction Data Base. Data elements could then automatically be retrieved and inserted in their 
appropriate slots in the basic work unit data structure. Again this scheme presupposes that data 
regarding the basic unit of analysis be captured and recorded electronically and placed into a 
structured database. 
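
The If-Then rules described above might look like the following sketch. The record fields (`record_type`, `id`, `technician`, `start`, `end`) and the slot names are hypothetical, since the structure of the Work Instruction Data Base is not specified here.

```python
# Each rule is (condition on the record, slot to fill, function extracting the value).
# All field and slot names are invented for illustration.
RULES = [
    (lambda r: r.get("record_type") == "work_instruction", "task_id", lambda r: r["id"]),
    (lambda r: "technician" in r, "performed_by", lambda r: r["technician"]),
    (lambda r: "start" in r and "end" in r, "duration_hours", lambda r: r["end"] - r["start"]),
]

def build_work_unit(record, rules=RULES):
    """Apply each If-Then rule; a rule fires only when its condition holds."""
    unit = {}
    for condition, slot, extract in rules:
        if condition(record):
            unit[slot] = extract(record)
    return unit

record = {"record_type": "work_instruction", "id": "WI-1",
          "technician": "A. Tech", "start": 8.0, "end": 11.5}
work_unit = build_work_unit(record)
```

Because each rule is a small independent entry, someone who knows the database structure could add a rule for a new slot without touching the retrieval loop.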

3.1.5 Advantages of Design 

One advantage of this design is that the cost to generate work instructions would be reduced due 
to the facilities provided in the authoring tool which would automatically search and retrieve the 
needed information for the user. A paperless format would also eliminate the costs for archiving 
and distributing multiple hardcopies of work instructions to the appropriate recipients for 
management and execution of the work instructions. The system would also supply the user with 
the flexibility to organize data and information in accordance with the user's conceptualizations, 
dynamically, in real time. 

In the absence of a data warehouse, requests for alternative organizations of information would 
require that information systems personnel specialized in the interfaces of various data bases 
write routines to gather and organize the data in accordance with a request. This could take days 
if not weeks to achieve, depending upon the personnel resources available. Moreover, the 
self-organizing properties of CLASSIT would reduce the costs accrued in developing a model for 
database storage and retrieval. The traditional approach to database design can be an expensive 
process requiring considerable analyses of user needs and information storage and retrieval 
requirements. Once a structure for a database model is settled upon, the structure of the database 
is fixed. Any modifications would require considerable effort. This would all be eliminated by 
employing CLASSIT. 

3.1.6 Disadvantages of Design 

The disadvantages of this proposed design would include the cost to develop an intelligent work 
instruction authoring system, the costs to implement wireless communications, PDA hardware 
and software, the cost of a data base management system for the Work Instruction Data Base, 
and the costs to integrate the CLASSIT algorithm and associated agents with the Work 
Instruction Data Base. A user interface for the Work Unit Warehouse would have to be designed 
and developed. Interfaces connecting the Work Unit Warehouse with the various analysis 
packages for management decision-making would need to be developed. Lastly, any 
modifications to the way in which work is performed relative to the status quo would probably 
result in change order requests on the part of the contractor, with associated increases in cost. 

3.2 An Alternative Approach 

An alternative to the effort proposed above would make use of the current system for 
processing work instruction documents and executing work instructions. This alternative would 
also use existing document and data stores to capture the information required for 
constructing the basic work unit data structure. The development of the Work Unit Warehouse 
would consist of the same effort as that described above, with the exception that research 
would need to be conducted in the development of semantic search agents to identify and 
retrieve information regarding the attributes of the data structure and the raw data associated with 
the execution of the work unit. 

3.2.1 Semantic Search Agents 

Current search agents employ a simple key word search for information stored in various 
databases. These key words can form conjunctions of terms as well as disjunctions of terms. 
They are simple Boolean logic expressions. The number of terms is also typically constrained to 
two or three terms making up an expression. The search simply consists of recognizing the words 
making up the expressions and retrieving information which matches the key word expressions. 
The underlying meaning of the expression used for the search is not sought. In order to conduct 
searches based on meaning, the technology currently employed is natural language processing 
(NLP). This technology takes as an input a grammatical sentence, which represents the meaning 
of the information which is the target of the search. In order to establish the meaning of the 
expressions, the system must first parse the sentence grammatically and then interpret the 
meaning of the words making up the sentence to understand the query. Once the target of the 
query is understood the system can conduct a search and retrieve the desired information. The 
problem is that the words used in the query must be represented by the system in advance so 
that the meaning of the expressions can be identified. Considerable knowledge engineering 
is required to develop such systems. For general purpose applications such systems are cost 
prohibitive, if they can be developed to work at all. 

On the other hand a system could be designed which interviews the user and elicits from the 
user, his/her semantic representation of the target of the search. This approach would not require 
any form of natural language processing and would place the burden of establishing the 
meaningful representation of the query upon the user interacting with the system. The challenge 
then would be to design an interface, which could elicit such meaningful representations from 
users. If this can be achieved, then general purpose searches can be made of existing data 
sources to retrieve the needed information. This is necessary since in many cases users searching 
for information would not know the contents of the many legacy data bases which exist. The 
user could engineer his/her understanding of the meaning of the information which is the target 
of the search, launch an agent with this representation and receive responses which identify 
sources of data consistent with the user's meaningful representation. It would then be up to the 
user to decide if a particular data store would be of interest. The user could browse through the 
data store retrieved to determine if specific elements of information are located there. If so, then 
the site would be flagged to indicate that the user request is associated with the specific data store 
and a rule would be coded which points to that site given the representation provided by the user. 

This would not require the creation of new data capture technologies and a new data base 
management system as would be the case with the original proposal. The needed data could be 
retrieved, encoded and stored to conduct the analyses of concern employing the data warehouse. 
Such a capability is a compromise between a keyword system and a natural language processing 
system. Keyword searches are limited in that they consist of simple Boolean expressions and do 
not have associations to equivalent forms of these expressions. Natural language processing on 
the other hand, requires considerable engineering of the complete vocabulary of the domain of 
interest along with complex parsing mechanisms. Such systems are still limited in terms of their 
ability to resolve ambiguous expressions entered by the user. Consequently, a system which 
elicits from a user his/her meaningful representation of a search topic 
would limit the necessity for engineering an entire domain vocabulary. Also the need for 
complex understanding and generation of English sentence expressions as is the case with NLP, 
would be eliminated. In time various search agents would be developed by differing users within 
the organization and made available to other users in a library of search agents. 

3.2.2 The Representation of Meaning 

In order to design such a system, however, one must understand the nature of human associative 
memory, which is the seat of meaningfulness. Since the meaningfulness of something is the 
product of an individual's stored experiences in memory, in order to understand meaning one 
must examine the structure of information in human memory. The approach taken herein was 
first, to collect the works of the foremost researchers and theorists with respect to human memory 
and the representation of meaning; second, to examine their thinking regarding the representation 
of meaning in memory; third, to find some common ground amongst these theorists; and fourth, to 
come up with a set of guidelines or principles which could guide the design of an interface for 
eliciting meaning from a user. 

The primary sources for theoretical and empirical research were Anderson [7], [8] and [9], 
Anderson and Bower [10], Quillian [11] and [12], Kintsch [13] and [14] and Ausubel [15]. 
These investigators are considered to be the most prominent in the field of human memory. 
Common to all is their agreement that memory is associative and that 
meaning is therefore represented by a variety of associations in memory. Therefore, to understand 
meaning one must examine the types of associations which can be formed in human memory. 

Of particular importance in defining how meaning is represented is the classification scheme for 
meaningful representations espoused by Ausubel [15]. As in all disciplines, whether arts, 
sciences or engineering, classification comes first in understanding a subject matter. 
Ausubel [15] has proposed a threefold classification of types of meaning. The first and most 
basic form of meaning within this classification is that of representational meaning. 
Representational meaning is defined as words or symbols which represent corresponding objects. 
This is typically what takes place when one is learning a vocabulary. It involves rote learning, 
that is, the simple assignment of a name to an object. Representational meaning must come first 
in that the individual must have a name for something that s/he is referring to. Representational 
meaning involves supplying that name for something. It establishes an equivalence between a 
verbal symbol and a referent. There may also be different verbal symbols which establish the 
same equivalence relations. That is, different words may refer to the same thing. Representational 
meaning, however, is not flexible or general in that it has a very specific referent. 

The next type of meaning is that of concept meaning. Concept meanings are generic or 
categorical in nature. They are general and they are flexible. They are ideas which, like 
representational meanings, have verbal symbols, but the symbols have no specific referent. That 
is, the meaning of the symbols represents an entire class of instances or things which share some 
common attributes. These attributes provide the distinguishing characteristics that provide the 
meaning for the concept. Concept meaning is typically abstracted by experiencing many 
instances of something and recognizing similar features between these instances. The human 
information processing system performs this kind of function naturally. For example, a child 
who first experiences a ball may assign the representational meaning of ball to that 
specific object. That meaning, however, only refers to that specific ball and no other types of 
balls. With further experience, the child sees other round objects of differing colors and 
sizes which can be manipulated in the same way as the first ball experienced, and abstracts 
common criterial features. The word ball no longer represents a specific object called a ball but 
represents a whole class of specific objects which can be referred to as ball. Given this evolution 
of experiences, representational meaning comes first, followed by the formation of a concept. As 
we gain more experience with objects most words take on meanings which are conceptual in 
nature. That is, the names assigned to represent objects or ideas in our experiential base are 
typically conceptual in nature. As concepts are formed, other exceptional features can be taken 
on to modify our notion of the concept. Hence they are flexible and generalizable. For example, 
a football can take on many of the same characteristics as a ball since it shares similar properties 
with respect to how it is used and manipulated, although its shape is not round but elliptical. 
The elliptical shape is an exception but can easily be taken on to further specify the instances 
which can be classified as ball. 

The third type of meaning in Ausubel’s scheme is called propositional meaning. 
Propositional meaning expresses a relationship. The relationship is a comment about something. 
Propositional meaning is formed by a combination of concepts that are related to each other 
such that a new idea is formed. This new idea is more than the sum of its component concepts. 
For example, the proposition “semantic network” is made up of two concepts and means 
something more than the concepts “semantic” and “network” on their own. When these concepts 
are combined they refer to a web of associations between words that defines the meaning of the 
words relative to other words in the neighborhood of words to which they are associated. Other 
examples of propositional meaning may define the relationship between mass and energy or 
between heat and volume, etc. Most English sentential expressions yield propositional 
representations in order to establish meaning. 

Anderson and Bower [10], Quillian [11] and [12] and Kintsch [13] and [14] all support the 
notion of the propositional representation and the concept for establishing meaning in memory, 
although they may use different verbal referents. For example, Quillian [12] refers to “property 
information” as the basic building block for meaning. This property information is essentially a 
labeled association or a relation, a proposition. Examples of such property information would 
consist of verb phrases, relative clauses, adjectival or adverbial modifiers or any verb and its 
object. The other fundamental units are referred to as types and tokens, a token being a member 
of a type. That is types represent classes or concepts and tokens represent characteristics or 
modifiers of a type. Tokens themselves may serve as types subsuming other tokens, which in 
turn modify them. 

The other theorists reviewed also propose that propositions and concepts can be nested within 
other concepts or propositions forming a heterarchical network representing all of the potential 
associations of concepts and relations. All agree on some form of subset-superset structuring of 
concepts, which forms a hierarchy of meaning typically in a top down fashion moving from 
general to specific. That is with experience meaning becomes organized moving from high level 
concepts or propositions to low level instances or instantiations of the concepts or propositions. 
Quillian additionally proposes that the attributes or tokens of a type can take on different weights 
which indicate how indicative that attribute is of identifying membership to a type. Highly 
weighted tokens are more predictive of a type’s membership than low weighted tokens. This 
allows for considerable flexibility in making predictions regarding membership to a class or type. 
Anderson [7] and Kintsch [13] also support this notion of strength of association or activation 
between attributes and their respective associations. 

With respect to differing types of meaningful representations, then, all of the theorists are in 
agreement as to propositional and conceptual classes of meaning, although they have not 
explicitly identified them as classes as has Ausubel. Most, if not all, agree that the most 
prevalent forms of memory for meaning consist of concepts and propositions idealized as a 
network of associations made up of concepts and the myriad of relationships they can form 
dependent upon the context in which they are used. The context of textual information therefore 
is also an important component in establishing the meaning of a target of search. This context 
forms the neighborhood of words, which provide meaning for the target under search and would 
govern what gets visited. 

3.2.3 Contextual Relations 

These propositional relations and subset-superset hierarchies forming concepts define what is 
referred to as the context of relations or associations between the words. It is this context of 
relations and associations which forms the meaning of an expression. The search for 
information consistent with a user's understanding of the meaning of the target must then 
incorporate those word units and their associations making up this context of relations. All of 
the theorists reviewed agree that such associations are what provide for meaning in human 
memory. Therefore, the context of relations or associations must be defined by a user to 
represent the user's meaning for the target of search. Identifying the kinds of associations and 
relations would then define this context and represent the meaning of the information targeted for 
search by the user. 


First of all, as indicated by these theorists, the meanings of concepts are represented as a hierarchy 
of associations between the concept name and the attributes which define the meaning of the 
concept. In turn the names of attributes and the name of the concept may have representational 
equivalents. That is, they may be known by different names. The structure for representing a 
concept would then look like a tree. The concept name would be the top level node of the tree 
and the attribute names would serve as the first layer of nodes linked by downstream arcs from 
the concept name node, as in Figure 7.0. 



[Figure 7.0: concept tree; recoverable label: "Representational Equivalents"] 

Figure 7.0 Nodes and links representing a concept and its defining attributes along with 
representational equivalents. 


The nodes of Figure 7.0 would represent word units: the verbal names assigned 
as the concept name, the names of attributes and the representational equivalents to the concept 
name and attribute names. The attributes are ANDed together representing a conjunction of 
terms which are the defining characteristics of the concept. A concept name and/or the names of 
its defining attributes may also have equivalent representations. These would be different words 
that could be used to define the same characteristics or the same concept name. They would be 
represented as a disjunction of terms, that is, they would be ORed. These names for the concept and 
the names for the attributes would be elicited from the user along with any of their 
representational equivalents. If the user cannot recall representational equivalents then the 
system can make use of WordNet. WordNet is an ontology of English words which provides 
alternative representations for English language terms and currently consists of over 100,000 
words and growing. It can be downloaded free from the DARPA web site or from Princeton 
University. Equivalent terms for the names provided by the user can be retrieved and examined 
by the user for their appropriateness as representational equivalents. It may be that the concept 
name and its equivalents along with the attributes of that concept and their equivalents can be 
automatically retrieved and inserted from WordNet into the user's representation of the meaning 
of the target. The user could then inspect this representation and edit it if desired. 
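
One way to hold such a concept tree is sketched below. The class names, the single-word matching and the example concept are assumptions made for illustration. Equivalents are supplied by hand here rather than retrieved from WordNet.

```python
import re
from dataclasses import dataclass, field

def words_of(text):
    """Lower-cased set of the alphabetic words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

@dataclass
class Term:
    name: str
    equivalents: list = field(default_factory=list)  # ORed alternative names

    def found_in(self, text):
        names = {self.name.lower(), *(e.lower() for e in self.equivalents)}
        return bool(words_of(text) & names)

@dataclass
class Concept:
    label: Term
    attributes: list = field(default_factory=list)   # ANDed defining attributes

    def named_in(self, text):
        """True when the concept name or one of its equivalents appears."""
        return self.label.found_in(text)

    def attributes_match(self, text):
        """True when every attribute appears under some equivalent name."""
        return all(a.found_in(text) for a in self.attributes)

# Hypothetical concept elicited from a user.
ball = Concept(
    Term("ball", ["sphere"]),
    [Term("round", ["spherical"]), Term("thrown", ["tossed", "kicked"])],
)
text = "A spherical toy that can be kicked"
```

In this example the text never names the concept, yet both attributes are found through their equivalents, which is the situation the weighting of attributes is meant to handle.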

During the process of building this concept tree, the attributes may also represent concepts, 
which in turn must be decomposed into their associated attributes. The process would be 
continued until the user makes a judgment that no further decomposition is required. In addition 
to specifying the concept names and attribute names the user would also be required to indicate 
the strength of the relation between the attributes and their associated concept. This feature is 
borrowed from the concepts of associative strength or activation from Anderson [8] and [9], from 
Kintsch [13] and [14] and from Quillian [11] and [12]. In many cases a search which is being 
conducted comes across documentation which does not explicitly mention the concept name or 
its equivalents but does mention some number or all of the attributes. This would imply that the 
document is associated with the concept, dependent upon the amount of evidence found related to 
the attributes mentioned. The weights or strength of association between the attributes and the 
concept name could then be used to determine if the document is truly concerned with the 
concept in question. If attributes with high valued weights relative to their association with the 
concept are found, then the evidence is in favor of recognition of the concept. If not, then the 
concept's relation to the document would be questionable. The amount of evidence required in 
favor of a concept would be specified by the user. 
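
The text does not say how the weights are to be combined. The sketch below assumes one simple rule: the evidence for a concept is the weight of the attributes found divided by the total weight, compared against a threshold chosen by the user. The attribute names and weights are hypothetical.

```python
import re

def concept_evidence(text, attributes):
    """attributes: {name: (weight, [equivalents])}; returns found weight / total weight."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    total = sum(weight for weight, _ in attributes.values())
    found = sum(
        weight
        for name, (weight, equivalents) in attributes.items()
        if words & {name.lower(), *(e.lower() for e in equivalents)}
    )
    return found / total

def concerns_concept(text, attributes, threshold):
    """True when the evidence reaches the user-specified threshold."""
    return concept_evidence(text, attributes) >= threshold

# Hypothetical attributes of the concept "ball" with user-assigned weights.
ball_attributes = {
    "round": (6, ["spherical"]),
    "bounces": (3, ["bouncy"]),
    "inflatable": (1, []),
}
```

A document mentioning only a lightly weighted attribute falls below a high threshold, while one mentioning the heavily weighted attributes passes even if the concept name never appears.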

If the subject of a user's search is a propositional expression or a combination of concepts, then 
the user would be requested to enter the expression into the system which best represents the 
user's understanding of the target for search. This would more than likely not consist of more 
than three or four words, which is the limitation of most phrases. Any articles or prepositions 
would be ignored by the system and the concepts making up the expression would become the 
topic for decomposition. Again the user would be requested to supply alternative expressions 
equivalent to that originally specified. For example, one may want to conduct a search for the 
proposition “semantic representation”. This could equivalently be expressed as the “structure of 
declarative memory”, or the “representation of meaning in memory”, or “cognitive models of 
meaning”, etc. Each of these propositions equally (in my mind anyway) represents the meaning 
of what I am searching for. These propositions are made up of a combination of concepts, each 
of which independently does not refer to the target of the search. However, their conjunction 
would more than likely get me to the specific information I am looking for. However, the 
combination of words used in different documents may not be the same as any one combination 
or expression listed. Therefore one would need to decompose each concept in each proposition 
in a similar fashion as that described for a single concept term. If any conjunction of terms 
representing the various representational equivalents of the concepts making up these equivalent 
expressions is found, then this would indicate that the meaning of what I am looking for is 
contained in the document or database. 

Again the process for representing the meaning of a proposition could be aided by retrieval of 
individual concepts and attribute names from WordNet. If this is the case then the task becomes 
less difficult for the user. However, in many technical domains the terms used would probably 
not be represented in WordNet and the user would have to be guided through a process of 
defining referential equivalents to their expressions and decomposing each into its 
component parts. The component parts must then further be decomposed as singular concepts 
and weights would be applied to their links as before. The search process then amounts to a 
search for evidence in support of a specific meaning. The meaning is inferred from this evidence 
regarding the conjunction of disjunctive terms. That is, several trees would be constructed to 
represent each concept and its associated disjunctive equivalents. If any of the attribute names or 
their equivalent representations are true then the attribute is true. Likewise, if any of the concept 
terms or their equivalent representations are true then the concept term is true. If the concept 
terms are all true then the proposition is true. The conjunction of these concepts would represent 
the propositional relationship of choice. Partial truth may also be represented by propagating 
forward the weights assigned to any attribute's strength of association with any given concept. 

The user can set the degree of truth which is acceptable during a search for meaning. Thus even 
documents which only partially satisfy the meaning of a target can be retrieved and examined by 
the user. Any documents which are found as a result of conducting these multiple searches on the 
conjoint terms and their representational equivalents could then be compared for their 
intersection. These would most likely be the documents which most closely match the 
meaning of the target of interest. 
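
The conjunction of disjunctive terms and the final intersection can be sketched as follows. The document collection, the equivalents chosen for each concept and the single-word matching are all assumptions made for illustration. Weights and partial truth are left out to keep the sketch short.

```python
import re

def words_of(text):
    """Lower-cased set of the alphabetic words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def concept_hits(documents, equivalents):
    """Ids of documents mentioning the concept under any equivalent name (OR)."""
    names = {e.lower() for e in equivalents}
    return {doc_id for doc_id, text in documents.items() if words_of(text) & names}

def proposition_hits(documents, proposition):
    """proposition: list of concepts, each a list of equivalent names; concepts are ANDed."""
    per_concept = [concept_hits(documents, equivalents) for equivalents in proposition]
    return set.intersection(*per_concept) if per_concept else set()

# Hypothetical documents and a two-concept proposition.
documents = {
    "d1": "Semantic networks and the representation of meaning",
    "d2": "Network routing tables",
    "d3": "A model of meaning in memory",
}
proposition = [["semantic", "meaning"], ["representation", "model", "network"]]
hits = proposition_hits(documents, proposition)
```

Only documents that satisfy every concept under at least one of its equivalent names survive the intersection, so `d2`, which matches the second concept alone, is excluded.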

This type of search bears similarity to the use of Bayesian equations to classify text documents. 
The equations are made up of expressions representing the probability of a specific class of 
document being found in general and the product of the probabilities of some number of 
words being conjointly found within a given document. This type of algorithm has been found 
to be capable of reading documents on news group pages and correctly classifying them into one 
of twenty different categories of news (Mitchell, 1999). The Bayesian equation for classifying 
such documents is as follows: 


Eq. [2]   $$P(D_i \mid A_1 = V_{1j}, \ldots, A_n = V_{nj}) = P(D_i) \prod_{x=1}^{n} P(A_x = V_{xj})$$


This simply states that the probability of a document belonging to a specific class, given that 
specific words are evidenced, is equal to the probability that the class of document would occur 
in your experience in general, times the product of the probabilities of the words occurring. 
This assumes that words occur independently of the occurrence of other words in a document. 
Although this assumption is obviously false, it works very well in practice. It is referred to as the 
Naive Bayes assumption. 
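
A minimal sketch of such a classifier follows. It uses the usual textbook form of the naive Bayes classifier, in which each word probability is conditioned on the document class and the score is proportional to, rather than equal to, the posterior probability. Logarithms and add-one smoothing are added for numerical convenience. The training documents and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(labelled_docs):
    """labelled_docs: list of (class_label, list_of_words)."""
    class_counts = Counter(label for label, _ in labelled_docs)
    word_counts = defaultdict(Counter)
    for label, words in labelled_docs:
        word_counts[label].update(words)
    vocab = {w for _, words in labelled_docs for w in words}
    return class_counts, word_counts, vocab

def classify(words, model):
    """Return the class with the highest log prior plus summed log word likelihoods."""
    class_counts, word_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        total = sum(word_counts[label].values())
        score = math.log(count / n_docs)          # log P(D_i)
        for w in words:
            if w in vocab:                        # words never seen in training are ignored
                # Add-one smoothing keeps the product from collapsing to zero.
                score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training set with two document classes.
model = train([
    ("space", ["shuttle", "launch", "orbit"]),
    ("space", ["launch", "pad", "shuttle"]),
    ("sports", ["ball", "goal", "team"]),
    ("sports", ["team", "ball", "score"]),
])
```

Summing logarithms is equivalent to multiplying the probabilities, and it avoids numerical underflow when a document contains many words.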

Consequently, classifying documents which belong to certain categories of information is very 
much like the type of meaningful search which is being proposed herein. The similarity to the 
Bayesian classification scheme therefore provides conceptual evidence that such a process 
would work to successfully retrieve meaningful information. 

4. CONCLUSIONS/FUTURE RESEARCH 

Future research can focus upon the first alternative, as defined in the scheme presented 
in Figure 2.0. Another alternative is to develop the semantic search capability, evaluating it 
empirically with users to determine agreement between the target searched for and the user's 
confirmation or disconfirmation of the information retrieved. If the empirical evidence supports 
the validity of the semantic search capability then it would be developed and integrated with the 
Work Unit Warehouse and its classification algorithm. A third option would be to develop the 
first alternative and also conduct research to develop the semantic search capability. The 
semantic search capability would have widespread use across the various divisions of NASA as 
well as having considerable commercialization potential. 